Generating Expressions Referring to Eventualities
Abstract
We note (a) the well-rehearsed linguistic observation that eventualities can be referred to by using either noun phrases or sentences, and (b) the seductive ontological parallels drawn by Bach [1986] between eventualities and individuals. We show how the mechanisms for knowledge representation and referring expression generation in an existing natural language generation system [Dale 1988, 1989] can be easily extended to combine these two insights in the generation of a wide variety of forms of reference to eventualities.

Footnote: Also of the Human Communication Research Centre at the University of Edinburgh. HCRC is supported by the UK Economic and Social Research Council. The work was also supported in part by the UK Science and Engineering Research Council through project grant GR/G22077. Email: [email protected]

Footnote: Also of the Department of Artificial Intelligence at the University of Edinburgh. Email: [email protected]

Referring to Eventualities

Most domains consist both of things and events. In order to communicate effectively, it is necessary to be able to generate appropriate references to ordinary physical objects, like chairs, cats and carrots, and to eventualities, like being in love, chopping carrots and having two hours of negotiations. A natural language generation system will thus have to encode information about both things and eventualities in its underlying knowledge base (KB) (cf. [Kowalski and Sergot 1986; Moens and Steedman 1987]). Now, consider the following examples:

(1) a. John and Bill discussed the paper.
    b. John and Bill had a discussion about the paper.
(2) a. Caesar died slowly.
    b. Caesar’s slow death upset Mary.

The data indicate that very similar information about eventualities can be conveyed via either sentences or noun phrases. In the context of theoretical linguistics, Chomsky [1970] observed that to avoid problems arising from syntactic parallelism, verbs and their derived nominals could be made to share lexical entries.
In the context of NL understanding, Dahl et al. [1987] urge the desirability of returning the same semantic representation when interpreting clauses and their nominalizations. In a generation system, we want to be able to generate both syntactic forms. It is convenient, for instance, to be able to refer anaphorically to an eventuality (introduced by a clause) using a pronoun, as in this example from Schuster [1988:602]:

(3) I want to move a block of text as a unit. How do I do it?

On this basis, we may impose the following condition on a generation system.

Desideratum 1: we must select a sufficiently neutral KB format, and provide sufficiently strong mapping rules from KB entities to surface strings to allow the generation of both forms.

We argue below that this in turn leads us towards another relevant desideratum.

Eventualities and the Mass–Count Distinction

The space of possible eventualities possesses considerable structure, and this structure has been taxonomised in various ways. Following Vendler [1967], much consideration has been given to the ‘aspectual types’ of utterances of English sentences (cf. [Mourelatos 1978; Hinrichs 1986; Dowty 1986; Moens and Steedman 1987]). Bach [1986] takes the space to include states and non-states; in turn, states consist of dynamic and static states, while non-states consist of processes and events. Events are then either protracted or momentaneous; momentaneous events are either happenings or culminations.

The issues involved in nominalization are broad; some provisos are therefore in order. We here consider only NPs which explicitly refer to eventualities; thus we do not discuss nominals with ‘hidden events’ in their analyses (cf. [Pustejovsky and Anick 1988]). We do not here consider the relation between tense and aspectual systems and NPs referring to eventualities.
Further, although we pursue a relationship between nominal reference and eventualities, there is no direct relation to Partee [1984], where a DRT treatment is given of the thesis that tense is an expression allowing anaphoric reference to times.

Putting states to one side, we wish here to focus on Bach’s observation of the seductive parallels between the mass–count distinction in nominal systems and the aspectual classification of verbal expressions. Bach exploited Link’s [1983] treatment of nominal systems in the context of verbal expressions. In particular, he expounded ‘this proportion: events:processes :: things:stuff’ [1986:5]. The algebra of events and processes derived has a number of appealing features, and explains why ‘temporal mass’ indicators, like twice last night, occur with some VPs but not others, as well as indicating under what circumstances we can switch from a count-expression to a mass-expression, and vice versa.

However, Bach does not discuss a class of expressions which share some of the features of both sides of the analogy, and these are precisely the NPs which refer to eventualities we adverted to above. Further examples include: war, concert, discussion and the destruction of Carthage by the Romans. We would argue that just as Bach’s interpretation of the mass–count distinction applies to verb phrase reference to eventualities, so too it applies to nominal reference to eventualities; and that the distinction should again be understood in terms of a process–event division. For example:

(4) I had a discussion with Fred.
(5) I had two discussions with Fred.
(6) I had a lot of discussion with Fred.
(7) I had two hours of discussion with Fred.

Here, (4) and (5) represent the use of discussion as a count noun, appearing with an indefinite and a number expression respectively; (6) and (7) represent its use as a mass noun, appearing with the quantifier a lot of and a measure expression respectively.
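The four patterns in (4) to (7) can be made concrete with a small sketch. This is only a toy illustration, not the mechanism of the system discussed in this paper: the function `realise_np`, its parameters, and the `NUMBER_WORDS` lexicon are hypothetical names introduced here, and the article and plural rules are deliberately naive.

```python
# Toy sketch (hypothetical names throughout): realising one eventuality noun
# under a count reading or a mass reading, mirroring examples (4)-(7).

NUMBER_WORDS = {2: "two", 3: "three"}  # assumed miniature lexicon


def realise_np(noun, reading, number=1, measure=None):
    """Return an English noun phrase for `noun` under the given reading.

    reading == "count": indefinite singular, or number expression + plural.
    reading == "mass":  measure expression, or the quantifier "a lot of".
    """
    if reading == "count":
        if number == 1:
            return f"a {noun}"                        # (4) indefinite
        return f"{NUMBER_WORDS[number]} {noun}s"      # (5) number expression
    if reading == "mass":
        if measure is not None:
            return f"{measure} of {noun}"             # (7) measure expression
        return f"a lot of {noun}"                     # (6) mass quantifier
    raise ValueError(f"unknown reading: {reading!r}")


if __name__ == "__main__":
    print(realise_np("discussion", "count"))
    print(realise_np("discussion", "count", number=2))
    print(realise_np("discussion", "mass"))
    print(realise_np("discussion", "mass", measure="two hours"))
```

The point of the sketch is that a single lexical item serves both readings; only the reading and its accompanying number or measure differ.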
Lewis’s [1983] ‘Universal Grinder’, which converts a count expression into a mass one, seems to apply rather easily to eventualities, and, following Bach, we may say that we have moved from the discrete event of discussion into the process stuff that composes it. Since grinding and its converse, packaging, seem to be zero-morphological operations, leaving no trace of their operation in lexical realisations, it is perhaps unsurprising that English should contain many eventuality-denoting words that appear both as count nouns and mass nouns; killing, for example, can co-occur both with mass-measures and count-measures. We therefore impose a further condition on our generation system.

Desideratum 2: if our generation system is to produce reference to eventualities, it must be able to generate both mass and count eventuality expressions.

Given the two desiderata, we want to be able to generate sentential and noun phrase references to eventualities, in the latter case allowing singular, mass or plural expressions. We demonstrate below that this is possible by relatively simple augmentation of the ontological structures and generation mechanisms in an existing natural language generation system [Dale 1988, 1989].

Footnote: In a fuller analysis, we would argue that states are to be represented via objects possessing temporally-constrained properties.

The Representation of Entities

Dale’s [1988, 1989] EPICURE system contains knowledge base structures appropriate for generating count, mass and plural reference to things. The system makes use of a notion of a generalized physical object or physobj. This permits a consistent representation of entities irrespective of whether they are viewed as individuals, masses or sets, by representing each as a knowledge base entity (KBE) with an appropriate structure attribute.
To construct a referring expression corresponding to a KBE, we first build a deep semantic structure which specifies the semantic content of the noun phrase to be generated. We call this the recoverable semantic content, since it consists of just that information the hearer should be able to derive from the corresponding utterance, even if that information is not stated explicitly: in particular, elided elements and instances of one-anaphora are represented in the deep semantic structure by their more semantically complete counterparts.

From the deep semantic structure, a surface semantic structure is then constructed. Unlike the deep semantic structure, this closely matches the syntactic structure of the resulting noun phrase, and is suitable for passing directly to a PATR-like unification grammar. It is at the level of surface semantic structure that processes such as elision and one-anaphora take place.

The transitions from KBE to deep semantic structure (the issue of content determination) and from deep semantic structure to surface semantic structure (the choice of anaphoric strategies and broad linguistic choice) are performed by means of independent sets of mapping rules which rewrite the appropriate structures from one level into those of the other.

Dale’s [1988] system encoded only rudimentary information about eventualities, although still in the same general form as the encoding of physobjs. Eventualities were treated simply as operators which change the properties of objects. Hence (as appropriate to the sublanguage chosen for the system), eventualities appeared as (imperative) sentences, such as peel and slice the onions or simmer the soup. Effectively, they were treated, in accordance with situation-calculus based planning, as transitions from global state to global state. However, by representing eventualities as the (near) equals of physical objects, we can satisfy both of our desiderata.
Below, we extend the parallels between the two types of entities.

Footnote: There is insufficient space in the present paper to provide examples of these two levels of representation: see Dale [1988, 1989] for examples.

Common Features of Physobjs and Eventualities

The domain in which the system operates consists of a finite set of entities. Each entity is represented by a distinct symbolic constant called its index. An entity is either a physobj or an eventuality, where these are defined as follows: a physobj is any (not necessarily contiguous) collection of contiguous regions of space occupied by matter; an eventuality is any (not necessarily contiguous) collection of contiguous regions of time occupied by process stuff. Physobjs have spatial parts; if a physobj can be decomposed into parts, those parts will be physobjs. Eventualities have temporal parts; if an eventuality can be decomposed into parts, those parts will be eventualities.

Every entity has, in addition to its index and type, a specification. The specification of an entity provides all the information known to the system about that entity. An entity may have as part of its specification a location. In a physobj, this will typically be its spatial location; in an eventuality, this will typically be its temporal location, given via begin and end points (which may coincide). In both cases, the system need not know any value for the location.

However, every entity must have, as part of its specification, a substance. In the case of physobjs, the substance is the kind of matter from which the object is made; in the case of eventualities, the substance is the kind of process stuff that makes up the eventuality. There is a finite but extensible set of substances and process stuffs represented within the system by means of symbolic constants, which are organised into a taxonomic graph structure.

As a further part of its specification, every entity has a structure.
This corresponds to the way in which the entity is perceived. Whether the entity is a physobj or an eventuality, its structure is either individual, set or mass. An eventuality is treated on a par with a physobj. Whether it is referred to as a process or an event is determined by whether its structure is mass or individual.

Any entity may have any number of additional properties specified as part of its specification, where those additional properties are drawn from a finite but extensible set of properties. These properties are binary valued features with + and − being the possible values. For example, for either Caesar died slowly or Caesar’s slow death, we would expect the KB specification to include [slow = +] as part.

If an entity has structure individual, then it also has a packaging as part of its specification. A packaging is a tuple consisting of a shape and a size. The possible values of shape and size are each drawn from two finite but extensible sets: one in each case relevant to physobjs, the other to eventualities. If an entity is a mass, it may or may not have a quantity. If the ...

Footnote: We will not discuss here the obvious benefits of this generalisation for anaphoric reference to eventualities.

(Feature-structure fragment: index = x0; stage = s0; spec = structure = set)
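The constraints on entities stated in this section (a mandatory substance and structure, an optional location, a packaging required for individuals, an optional quantity for masses, and binary properties) can be summarised in a minimal sketch. It is not the EPICURE representation itself: the class names, field types, the `check` and `aspectual_view` helpers, and the example substance and packaging values are all assumptions made for illustration. In particular, modelling the Universal Grinder as a switch of structure from individual to mass that drops the packaging is our reading, not something the text specifies.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, Optional, Tuple


@dataclass
class Spec:
    substance: str                               # mandatory: matter or process stuff
    structure: str                               # "individual", "set" or "mass"
    location: Optional[Tuple[str, str]] = None   # optional; need not be known
    packaging: Optional[Tuple[str, str]] = None  # (shape, size) for individuals
    quantity: Optional[str] = None               # optional, relevant to masses
    properties: Dict[str, bool] = field(default_factory=dict)  # True = "+"


@dataclass
class Entity:
    index: str   # distinct symbolic constant
    type: str    # "physobj" or "eventuality"
    spec: Spec


def check(entity: Entity) -> None:
    """Raise ValueError if the entity violates a constraint stated in the text."""
    if entity.type not in ("physobj", "eventuality"):
        raise ValueError("type must be physobj or eventuality")
    if entity.spec.structure not in ("individual", "set", "mass"):
        raise ValueError("structure must be individual, set or mass")
    if entity.spec.structure == "individual" and entity.spec.packaging is None:
        raise ValueError("an individual must have a packaging")


def aspectual_view(entity: Entity) -> Optional[str]:
    """Mass eventualities are processes, individual ones are events."""
    if entity.type != "eventuality":
        return None
    # The text does not say how a set of eventualities is described,
    # so that case also yields None here.
    return {"mass": "process", "individual": "event"}.get(entity.spec.structure)


def grind(entity: Entity) -> Entity:
    """Assumed grinder: view an individual as mass and drop its packaging."""
    return replace(entity, spec=replace(entity.spec, structure="mass", packaging=None))


# Hypothetical entry for "Caesar's slow death"; the substance and packaging
# values are invented placeholders.
death = Entity(
    index="e1",
    type="eventuality",
    spec=Spec(substance="dying", structure="individual",
              packaging=("culmination", "short"), properties={"slow": True}),
)
check(death)
```

Under these assumptions, `aspectual_view(death)` yields the event view and `aspectual_view(grind(death))` the process view, while the original entry is left unchanged.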
Similar resources
The Prevalence of Descriptive Referring Expressions in News and Narrative
Generating referring expressions is a key step in Natural Language Generation. Researchers have focused almost exclusively on generating distinctive referring expressions, that is, referring expressions that uniquely identify their intended referent. While undoubtedly one of their most important functions, referring expressions can be more than distinctive. In particular, descriptive referring ...
Generating referring expressions containing quantifiers
Recent work on the Generation of Referring Expressions has increased the generating capability of algorithms in this area. This paper asks whether the models underlying these proposals can still be used if even more complex referring expressions are generated. To discuss this issue, we will investigate a variety of referring expressions that pose difficulties to current generation algorithms. I...
OSU-2: Generating Referring Expressions with a Maximum Entropy Classifier
Selection of natural-sounding referring expressions is useful in text generation and information summarization (Kan et al., 2001). We use discourse-level feature predicates in a maximum entropy classifier (Berger et al., 1996) with binary and n-class classification to select referring expressions from a list. We find that while mention-type n-class classification produces higher accuracy of typ...
Generating One-Anaphoric Expressions: Where Does the Decision Lie?
Most natural language generation systems embody mechanisms for choosing whether to subsequently refer to an already-introduced entity by means of a pronoun or a definite noun phrase. Relatively few systems, however, consider referring to entities by means of one-anaphoric expressions such as the small green one. This paper looks at what is involved in generating referring expressions of this typ...
Generating Expressions that Refer to Visible Objects
We introduce a novel algorithm for generating referring expressions, informed by human and computer vision and designed to refer to visible objects. Our method separates absolute properties like color from relative properties like size to stochastically generate a diverse set of outputs. Expressions generated using this method are often overspecified and may be underspecified, akin to expressio...